Download Professional Data Engineer on Google Cloud Platform.Professional-Data-Engineer.Dump4Pass.2024-12-08.191q.tqb

Vendor: Google
Exam Code: Professional-Data-Engineer
Exam Name: Professional Data Engineer on Google Cloud Platform
Date: Dec 08, 2024
File Size: 1 MB

How to open VCEX files?

Files with VCEX extension can be opened by ProfExam Simulator.

Purchase
Coupon: EXAM_HUB

Discount: 20%

Demo Questions

Question 1
You are building a model to make clothing recommendations. You know a user’s fashion preference is likely to change over time, so you build a data pipeline to stream new data back to the model as it becomes available. 
How should you use this data to train the model? 
 
  A. Continuously retrain the model on just the new data.
  B. Continuously retrain the model on a combination of existing data and the new data.
  C. Train on the existing data while using the new data as your test set.
  D. Train on the new data while using the existing data as your test set.
Correct answer: B
Question 2
You create an important report for your large team in Google Data Studio 360. The report uses Google BigQuery as its data source. You notice that visualizations are not showing data that is less than 1 hour old. 
What should you do? 
 
  A. Disable caching by editing the report settings.
  B. Disable caching in BigQuery by editing table details.
  C. Refresh your browser tab showing the visualizations.
  D. Clear your browser history for the past hour, then reload the tab showing the visualizations.
Correct answer: A
Explanation:
Reference: https://support.google.com/datastudio/answer/7020039?hl=en  
 
Question 3
Your weather app queries a database every 15 minutes to get the current temperature. The frontend is powered by Google App Engine and serves millions of users. How should you design the frontend to respond to a database failure?
 
  A. Issue a command to restart the database servers.
  B. Retry the query with exponential backoff, up to a cap of 15 minutes.
  C. Retry the query every second until it comes back online to minimize staleness of data.
  D. Reduce the query frequency to once every hour until the database comes back online.
Correct answer: B
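
As an illustration of answer B, here is a minimal Python sketch of capped exponential backoff. The names `backoff_delays` and `query_with_backoff`, the 1-second base, the doubling factor, the jitter, and the use of `ConnectionError` to signal a database failure are all assumptions made for this example. Only the 15-minute (900-second) cap comes from the question.

```python
import random
import time


def backoff_delays(base=1.0, factor=2.0, cap=900.0):
    """Yield retry delays in seconds that grow geometrically up to `cap`."""
    delay = base
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


def query_with_backoff(run_query, max_attempts=12,
                       sleep=time.sleep, jitter=random.random):
    """Call `run_query` until it succeeds or `max_attempts` is exhausted.

    `run_query` is a hypothetical callable that raises ConnectionError
    while the database is unavailable.
    """
    delays = backoff_delays()
    for attempt in range(1, max_attempts + 1):
        try:
            return run_query()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # give up and let the caller handle the failure
            # Random jitter keeps many clients from retrying in lockstep.
            sleep(next(delays) * jitter())
```

Spacing the retries out this way avoids hammering a recovering database, which matters when millions of clients would otherwise retry at once.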
Question 4
You are creating a model to predict housing prices. Due to budget constraints, you must run it on a single resource-constrained virtual machine. Which learning algorithm should you use? 
  A. Linear regression
  B. Logistic classification
  C. Recurrent neural network
  D. Feedforward neural network
Correct answer: A
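
One reason linear regression suits a resource-constrained machine is that, for a single feature, the least-squares fit has a closed form that needs only a few sums over the data. The sketch below shows this. `fit_line` is a hypothetical helper, and any floor-area or price figures passed to it would be illustrative rather than taken from the question.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Predict a price for feature value `x` using the fitted line."""
    return slope * x + intercept
```

Training is a single pass over the data with constant extra memory, in contrast to the iterative training that the neural-network options require.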
Question 5
Your company is using WILDCARD tables to query data across multiple tables with similar names. The SQL statement is currently failing with the following error: 
 
# Syntax error: Expected end of statement but got "-" at [4:11]
SELECT age
FROM
bigquery-public-data.noaa_gsod.gsod
WHERE
age != 99
AND _TABLE_SUFFIX = '1929'
ORDER BY
age DESC
 
Which table name will make the SQL statement work correctly? 
 
  A. 'bigquery-public-data.noaa_gsod.gsod'
  B. bigquery-public-data.noaa_gsod.gsod*
  C. 'bigquery-public-data.noaa_gsod.gsod'*
  D. `bigquery-public-data.noaa_gsod.gsod*`
Correct answer: D
Explanation:
Reference: https://cloud.google.com/bigquery/docs/wildcard-tables  
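
For reference, the corrected statement can be assembled as in the sketch below. `wildcard_query` is a hypothetical helper written for this example, and it interpolates its arguments without escaping, so it is suitable for trusted values only. The point it illustrates is that the table name, including the trailing `*`, must be enclosed in backticks.

```python
def wildcard_query(table_prefix: str, suffix: str) -> str:
    """Build the corrected wildcard-table query from the question.

    Illustrative only: the arguments are inserted as-is, so pass trusted
    values and never raw user input.
    """
    return (
        "SELECT age\n"
        # Backticks wrap the whole name, with the * wildcard inside them.
        f"FROM `{table_prefix}*`\n"
        "WHERE age != 99\n"
        # _TABLE_SUFFIX selects which of the matched tables to read.
        f"  AND _TABLE_SUFFIX = '{suffix}'\n"
        "ORDER BY age DESC"
    )


if __name__ == "__main__":
    print(wildcard_query("bigquery-public-data.noaa_gsod.gsod", "1929"))
```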
 
Question 6
You are designing a basket abandonment system for an ecommerce company. The system will send a message to a user based on these rules: 
  • No interaction by the user on the site for 1 hour
  • Has added more than $30 worth of products to the basket
  • Has not completed a transaction
You use Google Cloud Dataflow to process the data and decide if a message should be sent. How should you design the pipeline? 
 
  A. Use a fixed-time window with a duration of 60 minutes.
  B. Use a sliding time window with a duration of 60 minutes.
  C. Use a session window with a gap time duration of 60 minutes.
  D. Use a global window with a time-based trigger with a delay of 60 minutes.
Correct answer: C
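
The session-window behaviour in answer C can be mimicked in plain Python. This sketch does not use the Cloud Dataflow API. `sessionize` is a hypothetical helper that groups one user's event times into sessions, starting a new session whenever 60 minutes or more have passed since the previous event.

```python
from datetime import datetime, timedelta


def sessionize(timestamps, gap=timedelta(minutes=60)):
    """Group event times into sessions separated by at least `gap` of inactivity."""
    sessions = []
    for ts in sorted(timestamps):
        # An event extends the current session only if it arrives
        # less than `gap` after the session's latest event.
        if sessions and ts - sessions[-1][-1] < gap:
            sessions[-1].append(ts)
        else:
            sessions.append([ts])
    return sessions


if __name__ == "__main__":
    start = datetime(2024, 1, 1, 9, 0)  # made-up timestamps for illustration
    events = [start,
              start + timedelta(minutes=20),
              start + timedelta(minutes=50),
              start + timedelta(hours=3)]
    for session in sessionize(events):
        print(session[0], "to", session[-1], f"({len(session)} events)")
```

A session closes only after the user has been idle for the full gap, which matches the first rule in the question. The basket-value and no-transaction rules would then be checked per closed session.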
Question 7
Your company handles data processing for a number of different clients. Each client prefers to use their own suite of analytics tools, with some allowing direct query access via Google BigQuery. You need to secure the data so that clients cannot see each other’s data. You want to ensure appropriate access to the data. Which three steps should you take? (Choose three.) 
 
  A. Load data into different partitions.
  B. Load data into a different dataset for each client.
  C. Put each client’s BigQuery dataset into a different table.
  D. Restrict a client’s dataset to approved users.
  E. Only allow a service account to access the datasets.
  F. Use the appropriate identity and access management (IAM) roles for each client’s users.
Correct answer: BDF
Question 8
You want to process payment transactions in a point-of-sale application that will run on Google Cloud Platform. Your user base could grow exponentially, but you do not want to manage infrastructure scaling. 
Which Google database service should you use? 
 
  A. Cloud SQL
  B. BigQuery
  C. Cloud Bigtable
  D. Cloud Datastore
Correct answer: D
Question 9
You need to store and analyze social media postings in Google BigQuery at a rate of 10,000 messages per minute in near real-time. You initially design the application to use streaming inserts for individual postings. Your application also performs data aggregations right after the streaming inserts. You discover that the queries after streaming inserts do not exhibit strong consistency, and reports from the queries might miss in-flight data. How can you adjust your application design?
 
  A. Re-write the application to load accumulated data every 2 minutes.
  B. Convert the streaming insert code to batch load for individual messages.
  C. Load the original message to Google Cloud SQL, and export the table every hour to BigQuery via streaming inserts.
  D. Estimate the average latency for data availability after streaming inserts, and always run queries after waiting twice as long.
     
Correct answer: D
Question 10
Your company is migrating their 30-node Apache Hadoop cluster to the cloud. They want to re-use Hadoop jobs they have already created and minimize the management of the cluster as much as possible. They also want to be able to persist data beyond the life of the cluster. What should you do? 
 
  A. Create a Google Cloud Dataflow job to process the data.
  B. Create a Google Cloud Dataproc cluster that uses persistent disks for HDFS.
  C. Create a Hadoop cluster on Google Compute Engine that uses persistent disks.
  D. Create a Cloud Dataproc cluster that uses the Google Cloud Storage connector.
  E. Create a Hadoop cluster on Google Compute Engine that uses Local SSD disks.
Correct answer: D